Subset Seed Extension to Protein BLAST

Authors

  • Anna Gambin
  • Slawomir Lasota
  • Michal Startek
  • Maciej Sykulski
  • Laurent Noé
  • Gregory Kucherov
Abstract

The seeding technique has become central to the theory of sequence alignment, and several efficient tools apply seeds to DNA homology search. Recently, the concept of subset seeds was proposed for similarity search in protein sequences. We experimentally evaluate the applicability of subset seeds to protein homology search. We advocate the use of multiple subset seeds derived from a hierarchical tree of amino acid residues. Our method uses an evolutionary algorithm to compute seeds that are specifically designed for a given protein family. A representation of seeds by deterministic finite automata (DFAs) is developed and built into the NCBI-BLAST software. This extended tool, named SeedBLAST, is compared with the original NCBI-BLAST and PSI-BLAST on several protein families. Our results demonstrate the superiority of SeedBLAST in terms of efficiency, especially for twilight zone hits. SeedBLAST is open source software, freely available at http://bioputer.mimuw.edu.pl/papers/sblast. Supplementary material and a user manual are also provided.
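
As a rough illustration of the subset seed idea mentioned in the abstract, the following minimal sketch assumes that each seed position is a partition of the amino acid alphabet and that two residues match at a position when they fall into the same group. The groupings, the seed, and the helper names (`partition`, `seed_hits`, `find_hits`) are hypothetical and chosen only for illustration; they are not the seeds or the residue tree used by SeedBLAST, and the sketch uses plain table lookups rather than the DFA representation described by the authors.

```python
# Illustrative sketch only: the residue groupings and the seed below are
# hypothetical, not those used by SeedBLAST. Only the 20 standard residues
# are handled; any other character raises KeyError.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def partition(*groups):
    """Map each residue to a group id; residues not listed become singletons."""
    table = {}
    for gid, group in enumerate(groups):
        for aa in group:
            table[aa] = gid
    next_id = len(groups)
    for aa in AMINO_ACIDS:
        if aa not in table:
            table[aa] = next_id
            next_id += 1
    return table


# Three hypothetical levels of a residue hierarchy, from finest to coarsest.
EXACT = partition()                                         # identity match
COARSE = partition("ILMV", "FWY", "KRH", "DE", "ST", "NQ")  # assumed groups
ANY = partition(AMINO_ACIDS)                                # wildcard

# A seed is a sequence of partitions, one per position.
SEED = [EXACT, COARSE, ANY, COARSE]


def seed_hits(seed, a, b):
    """Return True if words a and b fall in the same group at every position."""
    return len(a) == len(b) == len(seed) and all(
        pos[x] == pos[y] for pos, x, y in zip(seed, a, b)
    )


def find_hits(seed, query, subject):
    """Return all (i, j) offsets where the seed matches query[i:] and subject[j:]."""
    span = len(seed)
    return [
        (i, j)
        for i in range(len(query) - span + 1)
        for j in range(len(subject) - span + 1)
        if seed_hits(seed, query[i:i + span], subject[j:j + span])
    ]
```

For example, `find_hits(SEED, "MAIGK", "ALWRT")` compares every pair of 4-residue windows and reports the offsets where the words agree under the seed. A quadratic scan like this is only meant to show the matching rule; a practical tool would index the words instead of comparing all pairs.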

Similar resources

Effect of seed pre-soaking on compensation of late planting of two forage sorghum (Sorghum bicolor (L.) Moench) cultivars in second cropping

To evaluate the effect of seed pre-soaking on forage yield, forage quality, and water productivity in late planting of two forage sorghum cultivars, a field experiment was conducted as a split factorial arrangement in a randomized complete block design with three replications in the 2017 and 2018 growing seasons at the research field of the Seed and Plant Improvement Institute, Karaj, Iran. Four ...

Mercury BLASTN: Faster DNA Sequence Comparison using a Streaming Hardware Architecture

Motivation: Large-scale DNA sequence comparison, as implemented by BLAST and related algorithms, is one of the pillars of modern genomic analysis. One way to accelerate these computations is with a streaming architecture, in which processors are arranged in a pipeline that replicates the multistage structure of the algorithm. To achieve high performance, the processor hardware implementing the ...

The reaction of 109 rice lines to blast disease

Shahbazi H, Tarang A, Padasht F, Hosseini Chaleshtari M, Allah-Gholipour M, Khoshkdaman M, Mousavi Qaleh Roudkhani SA, Nazari Tabak S, Asadollahi Sharifi F, Pourabbas Dolatabad M (2022) The reaction of 109 rice lines to blast disease. Plant Pathology Science 11(1):24-35. Doi: 10.2982/PPS.11.1.24. Introduction: Blast caused by Pyricularia oryzae is the most important fungal disease of ri...

Impact of Storage Fungi on Soybean Seed Deterioration in Different Storage Conditions and Seed Moisture Content

DOR: 98.1000/2383-1251.1398.6.65.11.1. 1578.1585 Extended Abstract Introduction: Understanding the complex characteristics that control the life span of the seed has ecological, agricultural and economic importance. Inappropriate storage conditions after harvesting destroy a large part of annual yield partly due to microbial activity in the storage. Damage from storage fungi varies based ...

The Use of Vector Seeds to Improve PSI-BLAST Sensitivity

PSI-BLAST [5] is widely used for searching protein databases for sequence similarities, especially distant homologies. Position-specific score matrices are constructed while it runs. These give the program the power to capture remote relations of a query sequence. PSI-BLAST needs multiple iterations in most circumstances and is time-consuming. Here, we modified the vector seed optimizing a...

Publication date: 2011